Abstract

This paper presents a general analysis of evolutionary algorithms for solving hard and easy fitness functions. Two results are proven in the paper: (1) using lower selection pressure is better for solving hard fitness functions; (2) the strong cut-off point is 1 for solving any easy fitness function, which means that using a population size larger than 1 brings no benefit.
1 Introduction

In theory, easy and hard functions have been widely taken as benchmark problems for evaluating the performance of evolutionary algorithms (EAs). For example, in pseudo-Boolean optimization, the One-Max and Leading-Ones functions are often regarded as easy for EAs [1][2][3], while the Fully Deceptive and Real Royal Road functions are regarded as hard for EAs [4][5].

The hardness of a problem for EAs is linked to both the fitness function and the EA used, and depends on features of the fitness landscape, such as isolation, deception and multi-modality. However, a non-deceptive function may be difficult for an EA [6], and a deceptive function may be easy [7]; a multi-modal function may be easy to solve [8], and a unimodal function may be difficult for certain EAs but easy for others [9].

Different from previous work, this paper aims at providing a general analysis of the hardest and easiest functions in any finite search space. Because no predictive measure can efficiently evaluate the problem difficulty for EAs [10], hard and easy functions are constructed on the time-based fitness landscape. This paper analyses these functions and answers two questions: how does selection pressure influence the hitting time of EAs, and where is the cut-off point?

The rest of the paper is organized as follows: section 2 describes elitist evolutionary algorithms; section 3 defines hard and easy fitness functions; sections 4 and 5 analyse hard and easy fitness functions respectively; and finally a brief conclusion is given in section 6.
2 Elitist Evolutionary Algorithms

Consider the problem of maximizing a fitness function $f(x)$:

\[ f_{\max} := \max\{f(x);\ x \in S\}, \tag{1} \]

where $S$ is a finite set. Without loss of generality, the fitness function takes $L+1$ values $f_0 > f_1 > \cdots > f_L$, called fitness levels. For convenience of analysis, assume that all constraints have been removed by a penalty function method. Thus all solutions in $S$ will be regarded as feasible solutions.

A $(\mu+\mu)$ evolutionary algorithm (EA) is described in Algorithm 1, where $\mu$ is the population size, $t$ the generation counter, and $\Phi_t$ a population, which is a vector of random variables.

Algorithm 1 $(\mu+\mu)$ Evolutionary Algorithm $A_\Phi$

input: fitness function;
generation counter $t \leftarrow 0$;
initialize $\Phi_t$;    /* $\Phi_t \in S^{(\mu)}$ */
while (no optimal solution is found) do
    $\Phi_{t+1/2} \leftarrow$ is mutated from $\Phi_t$;    /* $\Phi_{t+1/2} \in S^{(\mu)}$ */
    evaluate the fitness of each individual in $\Phi_{t+1/2}$;
    $\Phi_{t+1} \leftarrow$ is selected from $(\Phi_t, \Phi_{t+1/2})$;    /* $\Phi_{t+1} \in S^{(\mu)}$ */
    $t \leftarrow t + 1$;
end while
output: the maximal value of the fitness function.
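The loop of Algorithm 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the operators analysed in the paper: individuals are bit tuples, `one_max` is a hypothetical example fitness, mutation is bitwise with rate 1/n, and selection keeps the best μ of parents and offspring with parents winning ties. The names `mutate`, `mu_plus_mu_ea` and `max_generations` are introduced here for illustration.

```python
import random


def one_max(x):
    """Hypothetical example fitness: number of ones in a bit tuple."""
    return sum(x)


def mutate(x, rng):
    """Bitwise mutation: flip each bit independently with probability 1/n.
    Any point can be reached from any other point with positive probability."""
    n = len(x)
    return tuple((b ^ 1) if rng.random() < 1.0 / n else b for b in x)


def mu_plus_mu_ea(fitness, f_max, n, mu, rng, max_generations=100_000):
    """Run a (mu + mu) EA until some individual reaches fitness f_max.
    Returns (number of generations used, best fitness found)."""
    population = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(mu)]
    t = 0
    while max(map(fitness, population)) < f_max and t < max_generations:
        offspring = [mutate(x, rng) for x in population]
        merged = population + offspring
        # Stable sort: on equal fitness a parent stays ahead of its offspring,
        # so the best individual is replaced only by a strictly better one.
        merged.sort(key=fitness, reverse=True)
        population = merged[:mu]
        t += 1
    return t, max(map(fitness, population))
```

Calling `mu_plus_mu_ea(one_max, f_max=8, n=8, mu=4, rng=random.Random(0))` returns the generation count (the hitting time of this run) together with the best fitness; multiplying the generation count by μ gives the number of fitness evaluations after initialization.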



In the paper, $S^{(1)} := S$ is called the individual space and the Cartesian product $S^{(\mu)} := S \times \cdots \times S$ the population space; $x, y, z$ denote individuals and $X, Y, Z$ populations. $X$ consists of $\mu$ individuals $(x_1, \cdots, x_\mu)$ with $f(x_1) \ge \cdots \ge f(x_\mu)$.

An Elitist Evolutionary Algorithm (EEA) adopts global mutation and strict elitist selection operators, 
whose mathematical definitions are given below. 

• A global mutation operator $\mathcal{M}$ is a transition probability matrix from $S$ to $S$ whose entry $P_m(x,y) := P_m(y \mid x)$ satisfies: $P_m(x,y) > 0, \forall x, y \in S$.

• A strict elitist selection operator $\mathcal{S}$ is a transition probability matrix from $S^{(\mu)} \times S^{(\mu)}$ to $S^{(\mu)}$, whose entry $P_s(X,Y,Z) := P_s(Z \mid X,Y)$ satisfies:

\[ z_1 = \begin{cases} x_1, & \text{if } f(y_1) \le f(x_1), \\ y_1, & \text{if } f(y_1) > f(x_1). \end{cases} \tag{2} \]

Assume that both mutation and selection operators are independent of $t$ (called homogeneous); then the sequence $\{\Phi_t, t = 0, 1, \cdots\}$ can be modelled by a homogeneous Markov chain [11]. Its transition probability matrix is denoted by $P^{(\mu)}$, whose entry is

\[ P^{(\mu)}(X,Y) = P(\Phi_{t+1} = Y \mid \Phi_t = X), \quad \forall X, Y \in S^{(\mu)}. \]

Let the optimal set $S_{\mathrm{opt}} \subset S$ be the set consisting of all individuals which are optimal solutions to Problem (1), and the non-optimal set $S_{\mathrm{non}} := S \setminus S_{\mathrm{opt}}$. Let the optimal set $S^{(\mu)}_{\mathrm{opt}} \subset S^{(\mu)}$ be the set of populations containing at least one optimal solution, and the non-optimal set $S^{(\mu)}_{\mathrm{non}} := S^{(\mu)} \setminus S^{(\mu)}_{\mathrm{opt}}$.

Definition 1. The expected hitting time to the optimal set $S^{(\mu)}_{\mathrm{opt}}$ starting from the initial state $X$ is the mean value of the total number of generations needed to find an optimal solution:

\[ m^{(\mu)}(X) := E[\min\{t;\ \Phi_t \in S^{(\mu)}_{\mathrm{opt}}\} \mid \Phi_0 = X]. \]

The expected running time of a $(\mu+\mu)$ EA is the mean value of the total number of fitness evaluations, i.e., $\mu\, m^{(\mu)}(X)$.
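For a (1+1) EA with strict elitist selection on a very small search space, the expected hitting time of Definition 1 can be computed exactly. The following sketch assumes a hypothetical three-point space with a mutation matrix `P_M` (all entries positive, as global mutation requires) and uses the fact that such a chain only ever moves to strictly fitter points, so the values can be filled in from the fittest point downwards. The names `P_M` and `hitting_times_1p1_eea` are illustrative and not taken from the paper.

```python
from fractions import Fraction

H, Q = Fraction(1, 2), Fraction(1, 4)
# Hypothetical mutation matrix on S = {0, 1, 2}; P_M[x][y] = P_m(x, y) > 0.
P_M = {0: {0: H, 1: Q, 2: Q},
       1: {0: H, 1: Q, 2: Q},
       2: {0: Q, 1: Q, 2: H}}


def hitting_times_1p1_eea(p_m, fitness):
    """Exact expected hitting times m(x) of a (1+1) EA with strict elitist
    selection: the chain leaves x only for a strictly fitter offspring."""
    f_max = max(fitness.values())
    m = {}
    # Visit points from fittest to least fit, so m[y] is known for every
    # strictly fitter y before m[x] is computed.
    for x in sorted(fitness, key=fitness.get, reverse=True):
        if fitness[x] == f_max:
            m[x] = Fraction(0)
            continue
        better = [y for y in fitness if fitness[y] > fitness[x]]
        escape = sum(p_m[x][y] for y in better)
        m[x] = (1 + sum(p_m[x][y] * m[y] for y in better)) / escape
    return m
```

With fitness values `{0: 2, 1: 1, 2: 0}` (point 0 optimal) the function returns hitting times 0, 2 and 3 for the points 0, 1 and 2; with `{0: 2, 1: 0, 2: 1}` it returns 0, 8/3 and 4.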



Denote $m^{(\mu)}_{\infty} := \max\{m^{(\mu)}(X);\ X \in S^{(\mu)}\}$.



Definition 2. The ratio of hitting times between a $(1+1)$ EA and a $(\mu+\mu)$ EA is called population scalability, given by

\[ R(\mu) := \frac{m^{(1)}_{\infty}}{m^{(\mu)}_{\infty}}. \tag{3} \]

The strong cut-off point of a family of $(\mu+\mu)$ EAs ($\mu = 1, 2, \cdots$) is

\[ \mu_{\mathrm{cut}} := \max\{\mu;\ \forall \nu = 1, \cdots, \mu,\ R(\nu) \ge \nu\}. \tag{4} \]
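Given scalability values for population sizes 1, 2, ..., the strong cut-off point can be read off directly. The sketch below assumes the condition in the definition is the non-strict comparison R(ν) ≥ ν and that only a finite prefix of the family is available; `strong_cut_off` is a name introduced here for illustration.

```python
def strong_cut_off(scalability):
    """scalability[k - 1] holds R(k) for k = 1, 2, ...
    Returns the largest mu such that R(k) >= k for every k = 1, ..., mu,
    or 0 if even R(1) < 1 (restricted to the values supplied)."""
    mu_cut = 0
    for k, r in enumerate(scalability, start=1):
        if r < k:
            break
        mu_cut = k
    return mu_cut
```

For example, the values R = (1.0, 1.6, 2.1) give a cut-off point of 1, because R(2) = 1.6 is already below 2, while R = (1.0, 2.0, 3.5, 3.9) gives 3.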

The definition of the above cut-off point is stronger than that in [?].

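In the following we give lower and upper bounds on $R(\mu)$ based on drift analysis [1?][13]. Let $d(X)$ be a distance function such that

\[ d(X) \begin{cases} = 0, & \text{if } X \in S^{(\mu)}_{\mathrm{opt}}, \\ > 0, & \text{otherwise}. \end{cases} \]

Define the one-step drift [1?][15]

\[ \Delta^{(\mu)} d(X) := \sum_{Y \in S^{(\mu)}} P^{(\mu)}(X,Y)\,(d(X) - d(Y)), \]

\[ \Delta^{(\mu)}_{\min} := \min\{\Delta^{(\mu)} d(X);\ X \in S^{(\mu)}_{\mathrm{non}}\}, \]

\[ \Delta^{(\mu)}_{\max} := \max\{\Delta^{(\mu)} d(X);\ X \in S^{(\mu)}_{\mathrm{non}}\}. \]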
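The drift quantities above are easy to evaluate on a small chain. The sketch below does so for a (1+1) EA with strict elitist selection on a hypothetical three-point space; the mutation matrix `P_M`, the fitness `F` and the distance `D` are illustrative assumptions. `D` is chosen equal to the expected hitting times of this chain, in which case the one-step drift is exactly 1 at every non-optimal point.

```python
from fractions import Fraction

H, Q = Fraction(1, 2), Fraction(1, 4)
P_M = {0: {0: H, 1: Q, 2: Q},      # hypothetical mutation matrix, all entries > 0
       1: {0: H, 1: Q, 2: Q},
       2: {0: Q, 1: Q, 2: H}}
F = {0: 2, 1: 1, 2: 0}             # hypothetical fitness, point 0 is optimal
D = {0: Fraction(0), 1: Fraction(2), 2: Fraction(3)}  # hitting times of this chain


def eea_transition(p_m, fitness):
    """Transition probability P(x, y) of the (1+1) EA with strict elitist selection."""
    def prob(x, y):
        if fitness[y] > fitness[x]:
            return p_m[x][y]            # a strictly better offspring is accepted
        if y == x:                      # every other offspring is rejected
            return 1 - sum(p_m[x][z] for z in fitness if fitness[z] > fitness[x])
        return Fraction(0)
    return prob


def one_step_drift(x, prob, d):
    """Sum over y of P(x, y) * (d(x) - d(y))."""
    return sum(prob(x, y) * (d[x] - d[y]) for y in d)


def drift_bounds(prob, d):
    """Minimum and maximum drift over the points with positive distance."""
    drifts = [one_step_drift(x, prob, d) for x in d if d[x] > 0]
    return min(drifts), max(drifts)
```

Here `drift_bounds(eea_transition(P_M, F), D)` evaluates to (1, 1). Replacing `D` by the cruder distance `{0: 0, 1: 1, 2: 2}` gives the bounds (1/2, 3/4), which shows how the choice of distance function changes the drift.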

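Theorem 1. For $X \in S^{(\mu)}$, define $d(X) := \min\{m^{(1)}(x);\ x \in X\}$, then it holds:

\[ \Delta^{(\mu)}_{\min} \le R(\mu) \le \Delta^{(\mu)}_{\max}. \]

Proof. From Lemmas 1 and 2 in [13], we know that

\[ \frac{m^{(1)}_{\infty}}{\Delta^{(\mu)}_{\max}} \le m^{(\mu)}_{\infty} \le \frac{m^{(1)}_{\infty}}{\Delta^{(\mu)}_{\min}}, \]

and then we get the conclusion. □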

3 Definition of Hardest and Easiest Fitness Functions 

In the paper, easy and hard functions are defined on time-based fitness landscapes. 

Definition 3. Given a fitness function $f(x)$ and a $(1+1)$ EA, its associated time-based fitness landscape is the set of pairs $\{(m(x), f(x));\ x \in S\}$.

If $\forall x \notin S_{\mathrm{opt}}, y \notin S_{\mathrm{opt}}$: $m(x) < m(y) \Leftrightarrow f(x) > f(y)$, then the time-based fitness landscape is called monotonic.

If $\forall x \notin S_{\mathrm{opt}}, y \notin S_{\mathrm{opt}}$: $m(x) < m(y) \Leftrightarrow f(x) < f(y)$, then the time-based fitness landscape is called deceptive.
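The two conditions of Definition 3 can be checked mechanically once the hitting times are known. The sketch below assumes the hitting times `m` and fitness values `f` are supplied as dictionaries over the same finite set; the function name and the "neither" label are introduced here for illustration. With fewer than two non-optimal points both conditions hold vacuously and the sketch reports "monotonic".

```python
from fractions import Fraction


def classify_landscape(m, f, optimal):
    """Classify a time-based fitness landscape {(m(x), f(x))} over the
    non-optimal points as 'monotonic', 'deceptive' or 'neither'."""
    non_opt = [x for x in m if x not in optimal]
    pairs = [(x, y) for x in non_opt for y in non_opt if x != y]
    # monotonic: shorter hitting time  <=>  higher fitness
    if all((m[x] < m[y]) == (f[x] > f[y]) for x, y in pairs):
        return "monotonic"
    # deceptive: shorter hitting time  <=>  lower fitness
    if all((m[x] < m[y]) == (f[x] < f[y]) for x, y in pairs):
        return "deceptive"
    return "neither"
```

As a usage example with hypothetical values, hitting times `{0: 0, 1: 2, 2: 3}` with fitness `{0: 2, 1: 1, 2: 0}` are classified as monotonic, while hitting times `{0: 0, 1: Fraction(8, 3), 2: 4}` with fitness `{0: 2, 1: 0, 2: 1}` are classified as deceptive.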

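Given a $(1+1)$ EEA and the family of fitness functions with the same optimal set $S_{\mathrm{opt}}$, we construct the hardest fitness function $f_H(x)$ in this family as follows.

1. Let $S_0 = S_{\mathrm{opt}}$. Define $m_H(x) = 0, \forall x \in S_0$.

2. Let $S_1$ be the set of all individuals such that

\[ S_1 = \arg\max_{x \in S \setminus S_0} \left\{ \frac{1}{\sum_{y \in S_0} P_m(x,y)} \right\}. \]

Define $\forall z \in S_1$,

\[ m_H(z) = \max_{x \in S \setminus S_0} \left\{ \frac{1}{\sum_{y \in S_0} P_m(x,y)} \right\}. \]

Because $\forall x, y: P_m(x,y) > 0$, we know the value of $m_H(z)$ always exists.

3. Assume that the set $S_k$ has been defined. Let $S_{k+1}$ be the set of all individuals such that

\[ S_{k+1} = \arg\max_{x \in S \setminus \cup_{l=0}^{k} S_l} \left\{ \frac{1 + \sum_{l=0}^{k} \sum_{y \in S_l} P_m(x,y)\, m_H(y)}{\sum_{l=0}^{k} \sum_{y \in S_l} P_m(x,y)} \right\}. \]

Define $\forall z \in S_{k+1}$,

\[ m_H(z) = \max_{x \in S \setminus \cup_{l=0}^{k} S_l} \left\{ \frac{1 + \sum_{l=0}^{k} \sum_{y \in S_l} P_m(x,y)\, m_H(y)}{\sum_{l=0}^{k} \sum_{y \in S_l} P_m(x,y)} \right\}. \]

Because $\forall x, y: P_m(x,y) > 0$, we know the value of $m_H(z)$ always exists.

4. Repeat Step 3 until all individuals fall into some set $S_k$. That is, for some integer $L > 0$, $S = \cup_{k=0}^{L} S_k$.

5. Define $f_H(x)$ to be a fitness function such that

• $f_H(x) = f_l, \forall x \in S_l, l = 0, \cdots, L$;
• $f_l > f_{l+1}, \forall l = 0, \cdots, L-1$.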
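The level-by-level construction can be sketched in Python for a small search space. The example below is an illustration under assumptions: the three-point mutation matrix `P_M` is hypothetical, exact fractions are used so that ties in the arg max are detected reliably, and the flag `hardest=False` switches to the analogous construction that uses min instead of max. The name `build_levels` is introduced here.

```python
from fractions import Fraction

H, Q = Fraction(1, 2), Fraction(1, 4)
P_M = {0: {0: H, 1: Q, 2: Q},      # hypothetical mutation matrix, all entries > 0
       1: {0: H, 1: Q, 2: Q},
       2: {0: Q, 1: Q, 2: H}}


def build_levels(p_m, optimal, hardest=True):
    """Return (levels, m): levels[k] is the set S_k as a list, and m[x] is the
    value assigned to x when it is placed. Fitness decreases with k."""
    pick = max if hardest else min
    m = {x: Fraction(0) for x in optimal}
    levels = [sorted(optimal)]
    remaining = [x for x in p_m if x not in optimal]
    while remaining:
        assigned = list(m)
        value = {}
        for x in remaining:
            reach = sum(p_m[x][y] for y in assigned)
            value[x] = (1 + sum(p_m[x][y] * m[y] for y in assigned)) / reach
        extreme = pick(value.values())
        level = [x for x in remaining if value[x] == extreme]
        for z in level:
            m[z] = extreme
        levels.append(level)
        remaining = [x for x in remaining if x not in level]
    return levels, m
```

For this matrix and optimal set {0}, the max version yields the levels [[0], [2], [1]] with values 4 for point 2 and 8/3 for point 1, while the min version yields [[0], [1], [2]] with values 2 and 3. A fitness function is then obtained by assigning any strictly decreasing values to successive levels.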

The following theorem shows that $f_H(x)$ is the hardest fitness function in this family.

Theorem 2. Let $\{\Phi_t\}$ denote the Markov chain associated with the $(1+1)$ EEA for maximizing $f_H(x)$, and $\{\Psi_t\}$ the chain for maximizing another function $f(x)$ with the same optimal set. Then $m_\Phi(x) \ge m_\Psi(x), \forall x \in S$.

The time-based fitness landscape associated with $f_H(x)$ and the $(1+1)$ EEA is deceptive.^1

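Similarly we can construct the easiest function $f_E(x)$ as follows:

1. Let $S_0 = S_{\mathrm{opt}}$. Define $\forall x \in S_0, m_E(x) = 0$.

2. Let $S_1$ be the set of all individuals such that

\[ S_1 = \arg\min_{x \in S \setminus S_0} \left\{ \frac{1}{\sum_{y \in S_0} P_m(x,y)} \right\}. \]

Define $\forall z \in S_1$,

\[ m_E(z) = \min_{x \in S \setminus S_0} \left\{ \frac{1}{\sum_{y \in S_0} P_m(x,y)} \right\}. \]

3. Assume that the set $S_k$ is defined. Let $S_{k+1}$ be the set of all individuals such that

\[ S_{k+1} = \arg\min_{x \in S \setminus \cup_{l=0}^{k} S_l} \left\{ \frac{1 + \sum_{l=0}^{k} \sum_{y \in S_l} P_m(x,y)\, m_E(y)}{\sum_{l=0}^{k} \sum_{y \in S_l} P_m(x,y)} \right\}. \]

Define $\forall z \in S_{k+1}$,

\[ m_E(z) = \min_{x \in S \setminus \cup_{l=0}^{k} S_l} \left\{ \frac{1 + \sum_{l=0}^{k} \sum_{y \in S_l} P_m(x,y)\, m_E(y)}{\sum_{l=0}^{k} \sum_{y \in S_l} P_m(x,y)} \right\}. \]

Repeat Step 3 until all individuals fall into some set $S_k$. That is, for some integer $L > 0$, $S = \cup_{k=0}^{L} S_k$.

4. Define $f_E(x)$ to be a fitness function such that

• $f_E(x) = f_l, \forall x \in S_l, l = 0, \cdots, L$;
• $f_l > f_{l+1}, \forall l = 0, \cdots, L-1$.

^1 Its proof is given in the article: J. He and T. Chen, A General Analysis of Super Linear Scalability in Population-based Random Search Using Spectral Radius of the Fundamental Matrix, submitted to IEEE Transactions on Evolutionary Computation.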

Theorem 3. Let $\{\Phi_t\}$ denote the Markov chain associated with the $(1+1)$ EEA for maximizing $f_E(x)$, and $\{\Psi_t\}$ the chain for maximizing any function $f(x)$ with the same optimal set. Then $m_\Phi(x) \le m_\Psi(x), \forall x \in S$. The time-based fitness landscape associated with $f_E(x)$ and the $(1+1)$ EEA is monotonic.

Furthermore, if a time-based fitness landscape is deceptive, then the related fitness function is called hard; if a time-based fitness landscape is monotonic, then the related fitness function is called easy. The family of these hard and easy functions is called hard-easy (HE) functions.

4 Analysis of Hard HE Functions 

Intuitively using lower selection pressure will be better for hard HE fitness functions. The following theorem 
shows this is true for (1 + 1) EAs. 

Theorem 4. Given two $(1+1)$ EAs, using mutation operator $\mathcal{M}$, for maximizing a hard HE fitness function $f(x)$.

1. $(1+1)$-EA $A_\Phi$ uses a non-elitist selection operator $\mathcal{S}_\Phi$ such that

\[ P_s(x \mid x,y) = P_s(y \mid x,y). \]

2. $(1+1)$-EA $A_\Psi$ uses a non-elitist selection operator $\mathcal{S}_\Psi$ such that

\[ P_s(x \mid x,y) > P_s(y \mid x,y), \quad \text{if } f(x) > f(y), \]
\[ P_s(y \mid x,y) > P_s(x \mid x,y), \quad \text{if } f(y) > f(x). \]

Then $m_\Phi(x) \le m_\Psi(x)$.

Proof. Define a distance function $d(x) = m_\Psi(x), \forall x \in S$.

For the Markov chain $\{\Psi_t\}$, $\Delta_\Psi d(x)$ is the sum of the positive drift $\Delta^+_\Psi d(x)$ and negative drift $\Delta^-_\Psi d(x)$ [15]. By Lemma 3 in [13], we know $\forall x \in S$

\[ \Delta_\Psi d(x) = \sum_{y} P_\Psi(x,y)\,(m_\Psi(x) - m_\Psi(y)) = 1. \]

For the Markov chain $\{\Phi_t\}$, $\Delta_\Phi d(x)$ is the sum of the positive drift $\Delta^+_\Phi d(x)$ and negative drift $\Delta^-_\Phi d(x)$:

\[ \Delta_\Phi d(x) = \Delta^+_\Phi d(x) + \Delta^-_\Phi d(x). \]

Let's estimate the positive drift of the Markov chain $\{\Phi_t\}$. Because the fitness landscape is deceptive, we get

\[ \Delta^+_\Phi d(x) = \sum_{y: d(x) > d(y)} P_\Phi(x,y)\,(d(x) - d(y)) = \sum_{y: m_\Psi(x) > m_\Psi(y)} P_\Phi(x,y)\,(m_\Psi(x) - m_\Psi(y)) \ge \Delta^+_\Psi d(x). \]

Next let's estimate the negative drift of the Markov chain $\{\Phi_t\}$. Because the fitness landscape is deceptive, we get

\[ \Delta^-_\Phi d(x) = \sum_{y: d(x) < d(y)} P_\Phi(x,y)\,(d(x) - d(y)) = \sum_{y: m_\Psi(x) < m_\Psi(y)} P_\Phi(x,y)\,(m_\Psi(x) - m_\Psi(y)) \]
\[ \ge \sum_{y: m_\Psi(x) < m_\Psi(y)} P_\Psi(x,y)\,(m_\Psi(x) - m_\Psi(y)) = \Delta^-_\Psi d(x). \]

Then the total drift $\Delta_\Phi d(x)$ satisfies

\[ \Delta_\Phi d(x) \ge \Delta_\Psi d(x) = 1. \]

From Lemma 2 in [13], we get that $\forall x \in S$, $m_\Phi(x) \le m_\Psi(x)$. □



The above result can be generalized to $(\mu+\mu)$ EAs where $\mu > 1$.

Let $\Phi = (\phi_1, \cdots, \phi_\mu)$ be a vector of random variables, and define its time-based cumulative distribution function to be

\[ F_\Phi(d_1, \cdots, d_\mu) = P(m(\phi_1) \le d_1, \cdots, m(\phi_\mu) \le d_\mu). \tag{5} \]

If $F_\Phi(d_1, \cdots, d_\mu) \ge F_\Psi(d_1, \cdots, d_\mu)$, then $\Phi$ is said to dominate $\Psi$.

Definition 4. Mutation operator $\mathcal{M}$ is called regular if $\Phi_{t+1/2}$ dominates $\Psi_{t+1/2}$ given that $\Phi_t$ dominates $\Psi_t$.

Definition 5. Selection operator $\mathcal{S}_\Phi$ is called superior to selection operator $\mathcal{S}_\Psi$ if $\Phi_{t+1}$ dominates $\Psi_{t+1}$ given that $\Phi_t$ dominates $\Psi_t$ and $\Phi_{t+1/2}$ dominates $\Psi_{t+1/2}$.

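Theorem 5. If mutation operator $\mathcal{M}$ is regular, and selection operator $\mathcal{S}_\Phi$ is superior to selection operator $\mathcal{S}_\Psi$, then the hitting time of the $(\mu+\mu)$ EA using $\mathcal{M}$ and $\mathcal{S}_\Phi$ is no more than that of the $(\mu+\mu)$ EA using $\mathcal{M}$ and $\mathcal{S}_\Psi$.

Proof. Let $\Phi_0 = \Psi_0 = X$ be the initial population. If mutation operator $\mathcal{M}$ is regular, and selection operator $\mathcal{S}_\Phi$ is superior to selection operator $\mathcal{S}_\Psi$, then

\[ P(\Phi_t \in S^{(\mu)}_{\mathrm{opt}} \mid \Phi_0 = X) \ge P(\Psi_t \in S^{(\mu)}_{\mathrm{opt}} \mid \Psi_0 = X), \quad \forall t \ge 0. \]

From the above inequality and the following identities

\[ m_\Phi(X) = \sum_{t=0}^{\infty} \bigl(1 - P(\Phi_t \in S^{(\mu)}_{\mathrm{opt}} \mid \Phi_0 = X)\bigr), \qquad m_\Psi(X) = \sum_{t=0}^{\infty} \bigl(1 - P(\Psi_t \in S^{(\mu)}_{\mathrm{opt}} \mid \Psi_0 = X)\bigr), \]

we get $m_\Phi(X) \le m_\Psi(X)$. □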

For a hard HE function, the cumulative distribution function can be rewritten as follows:

\[ P(f(\phi_1) \le d_1, \cdots, f(\phi_\mu) \le d_\mu) + P(f(\phi_1) = f_{\max} \vee \cdots \vee f(\phi_\mu) = f_{\max}). \tag{6} \]

The intuitive meaning of mutation regularity is explained as follows: if $\Phi_t$ has a smaller fitness than $\Psi_t$, then after mutation, $\Phi_{t+1/2}$ still has a smaller fitness than $\Psi_{t+1/2}$ unless they reach the optimal set.

The intuitive meaning of selection superiority is explained as follows: if $\Phi_t$ has a smaller fitness than $\Psi_t$, and $\Phi_{t+1/2}$ has a smaller fitness than $\Psi_{t+1/2}$, then the population $\Phi_{t+1}$ (selected from $\Phi_t$ and $\Phi_{t+1/2}$ by $\mathcal{S}_\Phi$) has a smaller fitness than the population $\Psi_{t+1}$ (selected from $\Psi_t$ and $\Psi_{t+1/2}$ by $\mathcal{S}_\Psi$), unless they reach the optimal set. In other words, using lower selection pressure is better for solving hard HE fitness functions.

5 Analysis of Easy HE Fitness Functions 

For easy HE functions, using a population size larger than 1 does not bring any benefit. This is given in the 
following theorem. 

Theorem 6. Given any family of $(\mu+\mu)$ EEAs for solving any easy HE fitness function, its strong cut-off point is 1.

Proof. For the $(1+1)$ EEA, define the distance function $d(x) := m^{(1)}(x)$. From Lemma 3 in [13], we know for the $(1+1)$ EEA

\[ \Delta d(x) = \sum_{y} P^{(1)}(x,y)\,(d(x) - d(y)) = 1. \tag{7} \]

For a $(\mu+\mu)$ EA where $\mu \ge 2$ and $X = (x_1, \cdots, x_\mu)$, define

\[ d(X) := \min\{d(x);\ x \in X\} = d(x_1). \tag{8} \]
Let $\Phi_t = X = (x_1, \cdots, x_\mu)$, $\Phi_{t+1/2} = Y = (y_1, \cdots, y_\mu)$ and $\Phi_{t+1} = Z = (z_1, \cdots, z_\mu)$. The drift is

\[ \Delta^{(\mu)} d(X) = \sum_{Z:\ z_1 \ne x_1} P^{(\mu)}(X,Z)\,(d(X) - d(Z)). \]

Since the selection operator is strict elitist, we only need to analyse the case of $f(y_1) > f(x_1)$. So we have $z_1 = y_1$.

Since the time-based fitness landscape is monotonic, we know that no negative drift can happen. A positive drift happens only if one of the individuals in $X$ generates a child individual $y$ such that $f(y) > f(x_1)$.

Fix $y_1$ with $f(y_1) > f(x_1)$. Let $P_m(x, y_1)$ denote the probability of the individual $x$ generating the offspring $y_1$ through mutation.

Since each individual is mutated independently, the probability for $X$ to generate $Y$ (where $f(y_1) > f(x_1)$) satisfies

\[ P^{(\mu)}_m(X,Y) < \sum_{i=1}^{\mu} P_m(x_i, y_1). \]

The positive drift from $X$ to $Y$ with $f(y_1) > f(x_1)$ is bounded by

\[ P^{(\mu)}_m(X,Y)\,(d(X) - d(Y)) < \sum_{i=1}^{\mu} P_m(x_i, y_1)\,(d(x_1) - d(y_1)) \le \sum_{i=1}^{\mu} P_m(x_i, y_1)\,(d(x_i) - d(y_1)), \]

where the second inequality is drawn from $d(x_1) \le d(x_i), \forall i = 2, \cdots, \mu$.

Summing over all possible $y_1$ with $f(y_1) > f(x_1)$, we get the total positive drift

\[ \Delta^+ d(X) = \sum_{Z:\ f(z_1) > f(x_1)} P^{(\mu)}(X,Z)\,(d(X) - d(Z)) \]
\[ = \sum_{Y:\ f(y_1) > f(x_1)} P^{(\mu)}_m(X,Y)\,(d(X) - d(Y)) \]
\[ < \sum_{Y:\ f(y_1) > f(x_1)} \sum_{i=1}^{\mu} P_m(x_i, y_1)\,(d(x_i) - d(y_1)) \]
\[ \le \sum_{i=1}^{\mu} \sum_{y_1:\ f(y_1) > f(x_1)} P_m(x_i, y_1)\,(d(x_i) - d(y_1)). \]

Since the event of the individual $x_i$ generating the child $y_i = y_1$ through mutation is included in the event of $x_i$ generating a child $y_i$ with $f(y_i) > f(x_1) \ge f(x_i)$, we have

\[ \sum_{y_1:\ f(y_1) > f(x_1)} P_m(x_i, y_1)\,(d(x_i) - d(y_1)) \le \sum_{y_i:\ f(y_i) > f(x_i)} P_m(x_i, y_i)\,(d(x_i) - d(y_i)). \]



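Then we get $\forall \mu \ge 2$

\[ \Delta^{(\mu)} d(X) < \sum_{i=1}^{\mu} \sum_{y_i:\ f(y_i) > f(x_i)} P_m(x_i, y_i)\,(d(x_i) - d(y_i)) \le \mu, \]

where the second inequality is derived from equality (7).

Applying Lemma 2 in [13], we obtain $\forall \mu \ge 2$, $m^{(\mu)}_{\infty} > m^{(1)}_{\infty} / \mu$, which proves that the strong cut-off point is 1. □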
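On a very small search space, the population scalability of an easy function can be computed exactly, which gives a concrete instance of the theorem. The sketch below assumes a hypothetical three-point space with distinct fitness values, independent mutation of each parent according to `P_M`, and selection that keeps the best μ of parents and offspring with parents winning ties. All names are illustrative, and the example only checks one instance; it is not a proof.

```python
from fractions import Fraction
from itertools import product

H, Q = Fraction(1, 2), Fraction(1, 4)
P_M = {0: {0: H, 1: Q, 2: Q},      # hypothetical mutation matrix, all entries > 0
       1: {0: H, 1: Q, 2: Q},
       2: {0: Q, 1: Q, 2: H}}
F = {0: 2, 1: 1, 2: 0}             # easy for this matrix: fitter points hit sooner


def population_hitting_times(p_m, fitness, mu):
    """Exact expected hitting times of the (mu + mu) EA on a tiny space.
    Assumes distinct fitness values, so a changed population always has a
    strictly larger fitness sum and can be solved for before its predecessors."""
    def rank(x):
        return -fitness[x]
    states = list(fitness)
    f_max = max(fitness.values())
    # Populations are tuples sorted best-first, visited by decreasing fitness sum.
    pops = sorted({tuple(sorted(p, key=rank)) for p in product(states, repeat=mu)},
                  key=lambda p: sum(fitness[x] for x in p), reverse=True)
    m = {}
    for X in pops:
        if fitness[X[0]] == f_max:
            m[X] = Fraction(0)
            continue
        stay, rest = Fraction(0), Fraction(0)
        for Y in product(states, repeat=mu):      # Y[i] is the child of X[i]
            prob = Fraction(1)
            for x, y in zip(X, Y):
                prob *= p_m[x][y]
            Z = tuple(sorted(X + Y, key=rank)[:mu])
            if Z == X:
                stay += prob
            else:
                rest += prob * m[Z]
        m[X] = (1 + rest) / (1 - stay)
    return m


def scalability(p_m, fitness, mu):
    """R(mu): worst-case hitting time for size 1 divided by that for size mu."""
    worst_1 = max(population_hitting_times(p_m, fitness, 1).values())
    worst_mu = max(population_hitting_times(p_m, fitness, mu).values())
    return worst_1 / worst_mu
```

For this example the worst-case hitting times are 3 generations for μ = 1 and 41/21 generations for μ = 2, so R(2) = 63/41, which is below 2. Doubling the population therefore less than halves the number of generations, and the number of fitness evaluations grows.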

Among all $(1+1)$ EAs, the $(1+1)$ EEA is the best for easy HE functions.

Theorem 7. Given two $(1+1)$ EAs, using global mutation operator $\mathcal{M}$, for maximizing an easy HE fitness function:

1. $(1+1)$-EEA $A_\Phi$.

2. $(1+1)$-EA $A_\Psi$ using any non-elitist selection operator.

Then $\forall x \in S$, $m_\Phi(x) \le m_\Psi(x)$.



Proof. Define the distance function $d(x) = m_\Phi(x), \forall x \in S$.

For the Markov chain $\{\Phi_t\}$, according to Lemma 3 in [13], we know

\[ \Delta_\Phi d(x) = \sum_{y} P_\Phi(x,y)\,(m_\Phi(x) - m_\Phi(y)) = 1, \quad \forall x \in S. \]

Note that for the Markov chain $\{\Phi_t\}$, the negative drift is 0.

For the Markov chain $\{\Psi_t\}$, the drift $\Delta_\Psi d(x)$ is the sum of the positive drift $\Delta^+_\Psi d(x)$ and negative drift $\Delta^-_\Psi d(x)$:

\[ \Delta_\Psi d(x) = \Delta^+_\Psi d(x) + \Delta^-_\Psi d(x). \]

Let's estimate the positive drift of the Markov chain $\{\Psi_t\}$. Because the time-based fitness landscape is monotonic, we get

\[ \Delta^+_\Psi d(x) = \sum_{y: d(x) > d(y)} P_\Psi(x,y)\,(d(x) - d(y)) = \sum_{y: m_\Phi(x) > m_\Phi(y)} P_\Psi(x,y)\,(m_\Phi(x) - m_\Phi(y)) \le \Delta^+_\Phi d(x). \]

Next let's estimate the negative drift of the Markov chain $\{\Psi_t\}$:

\[ \Delta^-_\Psi d(x) = \sum_{y: m_\Phi(x) < m_\Phi(y)} P_\Psi(x,y)\,(m_\Phi(x) - m_\Phi(y)) \le \Delta^-_\Phi d(x) = 0. \]

Thus $\Delta_\Psi d(x) \le 1$.

Then from Lemma 1 in [13], we get that $\forall x \in S$, $m_\Psi(x) \ge m_\Phi(x)$. □
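The comparison in Theorem 7 can be checked numerically on a small example by solving the hitting-time equations m(x) = 1 + Σ_y P(x,y) m(y) for both chains. The sketch below is an illustration under assumptions: the three-point mutation matrix and fitness are hypothetical, and the non-elitist EA is modelled as always accepting a strictly better offspring and accepting any other offspring with probability `accept_worse`. The linear system is solved by plain Gauss-Jordan elimination over exact fractions.

```python
from fractions import Fraction

H, Q = Fraction(1, 2), Fraction(1, 4)
P_M = {0: {0: H, 1: Q, 2: Q},      # hypothetical mutation matrix, all entries > 0
       1: {0: H, 1: Q, 2: Q},
       2: {0: Q, 1: Q, 2: H}}
F = {0: 2, 1: 1, 2: 0}             # easy for this matrix; point 0 is optimal


def one_plus_one_chain(p_m, fitness, accept_worse):
    """Transition matrix of a (1+1) EA: a strictly better offspring is always
    kept, any other offspring is kept with probability accept_worse."""
    chain = {}
    for x in fitness:
        row = {}
        for y in fitness:
            if y != x:
                keep = 1 if fitness[y] > fitness[x] else accept_worse
                row[y] = p_m[x][y] * keep
        row[x] = 1 - sum(row.values())
        chain[x] = row
    return chain


def hitting_times(chain, optimal):
    """Solve (I - Q) m = 1 on the non-optimal states by Gauss-Jordan elimination."""
    free = [x for x in chain if x not in optimal]
    n = len(free)
    a = [[(1 if i == j else 0) - chain[x][y] for j, y in enumerate(free)]
         + [Fraction(1)] for i, x in enumerate(free)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        lead = a[col][col]
        a[col] = [v / lead for v in a[col]]
        for r in range(n):
            factor = a[r][col]
            if r != col and factor != 0:
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    m = {x: Fraction(0) for x in optimal}
    m.update({x: a[i][n] for i, x in enumerate(free)})
    return m
```

With `accept_worse = 0` (strict elitist selection) the hitting times from points 1 and 2 are 2 and 3; with `accept_worse = 1/2` they rise to 20/9 and 28/9, in line with the theorem for this instance.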



6 Conclusion 

This paper presents a general analysis of EAs for solving the hardest and easiest fitness functions. The 
hardest and easiest functions are defined on the time-based fitness landscape for any finite set. The fitness 
function related to a deceptive time-based fitness landscape is regarded as hard; and the fitness function 
related to a monotonic time-based fitness landscape is easy. 

It is proven that using lower selection pressure is better for solving hard HE fitness functions. It is also proven that the strong cut-off point of any $(\mu+\mu)$ EEA family is 1 when solving any easy HE fitness function. This means that using a population size larger than 1 brings no benefit.